A latent variable model for joint pause prediction and dependency parsing
Abstract
The prosody of speech is closely related to the syntactic structure of the spoken sentence, and thus analysis models that jointly consider these two types of information are promising. However, manual annotation of syntactic information and prosodic information such as pauses is laborious, and thus it can be difficult to obtain sufficient data to train such joint models. In this paper, we tackle this problem by introducing a joint pause prediction and dependency parsing model that treats pauses between consecutive words as latent variables. Using this model, it is possible to learn not only from data labeled with both syntax and pause information, but also from data labeled with only syntactic information, which can be obtained in larger quantities. Experiments find that a joint pause prediction and dependency parsing model obtains better pause prediction F-measure than a decision-tree-based baseline trained on the same data, and that the addition of more data using the proposed latent variable model leads to further gains of up to 11.6 points in F-measure.
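The idea of treating pauses as latent variables can be illustrated with a minimal sketch. It assumes a toy log-linear scorer over a small, explicitly listed set of candidate trees; the names `joint_score` and `log_likelihood`, the feature keys, and the weights are hypothetical and are not taken from the paper. When pauses are annotated, the sentence contributes the joint log-likelihood of the tree and the pauses; when only the tree is annotated, the pauses are summed out.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def joint_score(tree, pauses, weights):
    """Hypothetical unnormalized log-score of a (tree, pauses) pair.

    tree:   tuple of 1-based head indices, one per word (0 marks the root).
    pauses: tuple of 0/1 flags, one per gap between consecutive words.
    """
    score = 0.0
    for gap, pause in enumerate(pauses):
        left, right = gap + 1, gap + 2  # 1-based indices of the adjacent words
        attached = tree[left - 1] == right or tree[right - 1] == left
        key = ("pause" if pause else "no_pause",
               "attached" if attached else "detached")
        score += weights.get(key, 0.0)
    return score


def log_likelihood(tree, pauses, candidate_trees, weights):
    """Log-likelihood of one training sentence.

    pauses=None means the sentence carries syntax labels only, so the
    pause sequence is treated as latent and marginalized out.
    """
    n_gaps = len(tree) - 1
    all_pauses = list(itertools.product((0, 1), repeat=n_gaps))
    # Normalizer over every candidate tree and every pause sequence.
    log_z = logsumexp([joint_score(t, z, weights)
                       for t in candidate_trees for z in all_pauses])
    if pauses is not None:
        numerator = joint_score(tree, pauses, weights)
    else:
        numerator = logsumexp([joint_score(tree, z, weights)
                               for z in all_pauses])
    return numerator - log_z
```

Enumerating every pause sequence is only feasible for toy inputs; a practical model would need dynamic programming or approximate inference, and the sketch makes no claim about which of these the paper uses.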
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...
Fast and Robust Multilingual Dependency Parsing with a Generative Latent Variable Model
We use a generative history-based model to predict the most likely derivation of a dependency parse. Our probabilistic model is based on Incremental Sigmoid Belief Networks, a recently proposed class of latent variable models for structure prediction. Their ability to automatically induce features results in multilingual parsing which is robust enough to achieve accuracy well above the average ...
A Latent Variable Model for Generative Dependency Parsing
We propose a generative dependency parsing model which uses binary latent variables to induce conditioning features. To define this model we use a recently proposed class of Bayesian Networks for structured prediction, Incremental Sigmoid Belief Networks. We demonstrate that the proposed model achieves state-of-the-art results on three different languages. We also demonstrate that the features ...
Multilingual Joint Parsing of Syntactic and Semantic Dependencies with a Latent Variable Model
Current investigations in data-driven models of parsing have shifted from purely syntactic analysis to richer semantic representations, showing that the successful recovery of the meaning of text requires structured analyses of both its grammar and its semantics. In this article, we report on a joint generative history-based model to predict the most likely derivation of a dependency parser for...
A Latent Variable Model of Synchronous Syntactic-Semantic Parsing for Multiple Languages
Motivated by the large number of languages (seven) and the short development time (two months) of the 2009 CoNLL shared task, we exploited latent variables to avoid the costly process of hand-crafted feature engineering, allowing the latent variables to induce features from the data. We took a pre-existing generative latent variable model of joint syntactic-semantic dependency parsing, developed...